Final Report: Power Price Prediction#

Contributors#

  • Arjun Radhakrishnan

  • Sneha Sunil

  • Gaoxiang Wang

  • Mehdi Naji

Executive Summary#

Numerous Alberta-based organizations rely heavily on energy to fuel their business operations and are in search of an effective forecasting tool that offers accurate and interpretable predictions. Our business solution precisely addresses this need by offering an interpretable and explainable data science product deployed on the cloud, specifically designed for power price prediction in the Alberta Energy Market. Our solution equips organizations with the ability to make knowledgeable decisions about their energy purchases by forecasting hourly energy prices for the next 12 hours, supplemented with confidence intervals. It is noteworthy that our forecasting system has demonstrated a remarkable 34% improvement in prediction accuracy compared to the current system[1], which only forecasts for the next 6 hours. Our solution also addresses the lack of interpretability and explainability of predictions, which is showcased in an intuitive Tableau dashboard.

Introduction#

Over the past few decades, the electricity market in the province of Alberta has undergone a significant transformation, shifting from a regulated environment to a competitive, deregulated one in which prices are determined by the interplay of supply and demand. Prices are influenced by various market participants, including power generators, transmission companies, and retailers. Consequently, the deregulated nature of the market is a major driver of power price volatility. The interactive plot below illustrates the extent of this volatility: the dramatic swings in the pool price are a testament to its dynamic nature and a striking reminder of the inherent complexity of the data science problem to be addressed.

Energy-intensive industries in Alberta depend heavily on accurate price predictions to plan for future energy costs and optimize their operations. In light of the growing uncertainty caused by escalating price volatility, the importance of accurate electricity price predictions cannot be overstated. These predictions play a pivotal role in enabling stakeholders, including energy buyers, to navigate the market successfully by strategizing their operations efficiently. Currently, these organizations depend on the energy forecasting tool published by the Alberta Electric System Operator (AESO) to determine their energy costs in advance. AESO is the independent system operator responsible for operating Alberta’s electrical grid, facilitating the competitive electricity market, and managing the entire power distribution system for the province. The power pool price for every hour is finalized by AESO based on supply and demand. However, the current forecasts published by AESO only cover a short term of 6 hours into the future, are volatile, and lack interpretation and model visibility.

To reduce their expenses, companies could plan ahead and potentially explore alternative energy options if they had access to accurate forecasts that cover a longer window and are also interpretable and explainable. To address these challenges, our product offers a comprehensive solution that empowers organizations in Alberta to effectively analyze costs, optimize energy purchases, and ultimately maximize their profits. The scientific objectives of our product are:

  • Forecasting energy prices for the next 12 hours

  • Interpretability and explainability of predictions

  • Scalable and real-time prediction pipeline

  • Reactive Tableau Dashboard for visualizing the real-time results

Data Science Techniques#

Our project utilized two primary data sources:

Open-source Tableau Data: We had access to historical hourly time series data published by AESO in Tableau.

Open-source API Data: AESO provides a public API service that grants us access to real-time and historical data for a select subset of features.

Our initial dataset consisted of approximately 110 features and ~72,000 rows, covering various aspects such as power generation/supply by fuel type, energy load distribution across regions, import and export data, and system marginal price data. The primary target we focused on forecasting was the power pool price (CAD). This price represents the balanced average power price per hour for the Alberta province as determined by AESO. It is capped between 0 and 1,000 CAD to keep the Alberta electricity market stable and fair. Our feature set predominantly comprises numerical data, with the exception of one ordinal feature that we engineered.

Examining the plots reveals consistent trends in energy prices. On weekdays, prices are higher during peak hours (8 am to 8 pm) due to increased demand from business operations and lower during off-peak hours. Weekends follow a different pattern, with higher prices in the evenings. These observations are supported by the autocorrelation function plots, which clearly demonstrate daily seasonality in energy prices. Notably, Tuesdays have the highest average prices among weekdays. To capture the combined effects of day of the week and peak/off-peak hours, an ordinal feature called ‘weekly_profile’ was created to represent time-related variables and energy pricing dynamics.

The autocorrelation function (ACF) plot with 50 lags for the pool price clearly shows daily seasonality (a peak every 24 hours). Note: the ACF plot depicts the correlation between the price and its lagged values.

Data preprocessing#

For modeling purposes, we partitioned our data into two subsets: a training set from January 2021 to January 2023 and a testing set from February to May 2023. Given the absence of real-time data for all features through the API, we leveraged historical data to simulate a real-time prediction system. To accurately predict future prices based on historical values of influential factors, we transformed the time-series data of all the features into a tabular format by creating lagged versions of both the target variable and the relevant features.
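The transformation described above can be sketched like this (the column names and toy values are illustrative assumptions, not our actual schema):

```python
import pandas as pd

# Turn an hourly time series into a supervised-learning table by adding
# lagged copies of the target and a feature, then dropping the rows whose
# lags fall before the start of the series.
def make_lagged(df: pd.DataFrame, cols, lags) -> pd.DataFrame:
    out = df.copy()
    for col in cols:
        for lag in lags:
            out[f"{col}_lag{lag}"] = out[col].shift(lag)
    return out.dropna()

df = pd.DataFrame(
    {"pool_price": range(10), "gas_supply_mix": range(10, 20)},
    index=pd.date_range("2023-02-01", periods=10, freq="H"),
)
tabular = make_lagged(df, ["pool_price", "gas_supply_mix"], lags=[1, 2, 3])
```

Each row of `tabular` now pairs a timestamp's values with the previous hours' values, which is the format tabular regressors expect.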

Feature Selection and Engineering#

In the process of selecting from the 110 features, our primary strategy involved examining the correlations between the various features and the price. Considering the importance of interpretability in our model, we also conducted comprehensive market research and engineered several key features showing a significant correlation with the price. One such feature is the gas reserve margin, a buffer of gas readily available to generate electricity to meet sudden load demands and peak hours. As evidenced in our data visualizations, a dwindling gas reserve tends to correspond with an increase in price. Another is the gas supply mix, the proportion of total energy generation produced from gas: when the supply relies mostly on gas, the price increases, as gas is costly compared to the other sources.
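As a sketch, the two engineered features described above might be computed as follows (the column names and numbers are made up for illustration):

```python
import pandas as pd

# Toy hourly generation data; columns are hypothetical, not AESO's schema.
gen = pd.DataFrame({
    "gas_generation":    [5200.0, 5600.0, 6100.0],   # MW generated from gas
    "total_generation":  [9800.0, 10000.0, 10200.0], # MW from all sources
    "gas_max_capacity":  [7000.0, 7000.0, 7000.0],   # installed gas capacity
})

# Proportion of total generation supplied by gas.
gen["gas_supply_mix"] = gen["gas_generation"] / gen["total_generation"]

# Spare gas capacity available to meet sudden load demands and peak hours.
gen["gas_reserve_margin"] = gen["gas_max_capacity"] - gen["gas_generation"]
```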

For more information about the key engineered features, please check out the glossary.

We further refined our feature set by leveraging the coefficients from training an Elastic Net CV model and the feature importances deduced from training a Random Forest Regressor model.

Pursuing a second strategy, we investigated the correlation between lagged features and future prices projected for periods ranging from 1 to 12 hours. We identified features exhibiting correlations of absolute value greater than 0.3 and incorporated them into our feature set. Interestingly, both strategies resulted in almost identical sets of features.
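The screening idea can be sketched on synthetic data as follows (the feature names and the data-generating process are assumptions for illustration):

```python
import numpy as np
import pandas as pd

# Build a persistent AR(1) series standing in for the hourly pool price.
rng = np.random.default_rng(0)
n = 500
eps = rng.normal(size=n)
price = np.empty(n)
price[0] = eps[0]
for t in range(1, n):
    price[t] = 0.9 * price[t - 1] + eps[t]
price = pd.Series(price)

# One informative lagged feature and one pure-noise feature.
features = pd.DataFrame({
    "gas_supply_mix": price.shift(1) + rng.normal(scale=0.1, size=n),
    "unrelated_noise": rng.normal(size=n),
})

# Keep any feature whose current value has |corr| > 0.3 with the price
# 1 to 12 hours ahead.
selected = set()
for h in range(1, 13):
    future_price = price.shift(-h)
    corr = features.apply(lambda col: col.corr(future_price))
    selected |= set(corr[corr.abs() > 0.3].index)
```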

Modelling Strategy#

Since the price is extremely volatile, our first scientific objective, forecasting energy prices for the next 12 hours, is a complex task. Hence, we needed models apt for time series forecasting that can pick up temporal patterns. As a baseline, we chose the SARIMA (Seasonal Autoregressive Integrated Moving Average) model for univariate forecasting, as it is a popular classical time series technique that incorporates both autoregressive and moving average components along with seasonality. It also supports probabilistic forecasting and the generation of confidence intervals. To assess model performance, we selected the Root Mean Square Error (RMSE) as our evaluation metric because it aligns with our project’s focus on interpretability and is easily understandable, since the error is on the same scale as the actual values, measured in CAD. Our evaluation of the SARIMA baseline resulted in an average error of 83.85 CAD, with a standard deviation of approximately 73.11 CAD. However, recognizing that our problem involved multi-step forecasting, we decided to transition to more sophisticated machine learning models to improve the accuracy of our forecasts. In this approach, we incorporated the key engineered features in addition to using past price values.

For our forecasting horizon of 12 steps, we implemented the direct strategy: we trained 12 individual models on the same historical data up to the cut-off point, each responsible for predicting the price at a specific timestep within the horizon (1, 2, 3, …, 12). The cut-off hour refers to the point in time up to which data is used to train the models. For example, Model 1 always predicts the power price for the next timestep, while Model 12 forecasts the price 12 timesteps into the future. By adopting this approach, we avoided the accumulation and propagation of errors that can occur in the recursive strategy. Additionally, whenever a new data point became available, we advanced the cut-off time by 1 hour and retrained all 12 models on the most recent data. This allowed us to continuously incorporate real-time data and ensure reliable future predictions.
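The direct strategy can be sketched as follows, with a plain linear regressor standing in for our actual per-step models and a purely synthetic price series:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
y = rng.normal(size=600).cumsum()   # stand-in hourly price series
n_lags, horizon = 24, 12

# Lag matrix: row t holds the previous 24 hourly values.
X = np.column_stack([y[i : len(y) - n_lags + i] for i in range(n_lags)])

# Direct strategy: one model per forecast step.
models = {}
for h in range(1, horizon + 1):
    # The target for step h is the price h hours after the last lag in a row.
    target = y[n_lags + h - 1 :]
    models[h] = LinearRegression().fit(X[: len(target)], target)

# Forecast the next 12 hours from the most recent 24 observations.
last_window = y[-n_lags:].reshape(1, -1)
forecast = [models[h].predict(last_window)[0] for h in range(1, horizon + 1)]
```

Each model sees only its own horizon, so a step-7 error cannot propagate into the step-8 forecast, unlike the recursive strategy where predictions feed back in as inputs.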

To build our base pipeline, we leveraged the sktime package, a widely-used package that supports time series forecasting. However, we encountered a challenge where refitting the data to all 12 models became time-consuming and affected real-time responsiveness. To address this problem, we implemented custom code that significantly reduced the refit times from approximately 4.5 minutes to less than 0.5 seconds. This optimization was a significant achievement for us, allowing us to overcome the computational burden and enhance the efficiency of our pipeline.

Cross Validation Split#

Our training data spanned from January 1st, 2021, to January 31st, 2023. To capture any seasonality in price variation over time, we initially trained our models using two years’ worth of data. For cross-validation, we utilized the entire month of January 2023, creating 63 folds to validate our models. Since our data is time series data, preserving the temporal order of the data in each fold was crucial. The first fold consisted of an initial training window of two years and a validation set spanning 12 hours of data. We made predictions for these 12 hours and compared them with the actual prices in the validation set to calculate the errors. We then expanded the training window by including the 12 hours of data from the validation set and proceeded to predict the next 12 hours. This process was repeated for a total of 63 folds.
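The split logic can be sketched with toy sizes (the numbers below are illustrative; the actual setup used a two-year initial window and 12-hour validation folds):

```python
import pandas as pd

# Expanding-window cross validation: train on everything up to the cut-off,
# validate on the next `horizon` hours, then fold those hours into training.
def expanding_window_splits(index, initial, horizon, n_folds):
    splits = []
    train_end = initial
    for _ in range(n_folds):
        train = index[:train_end]
        valid = index[train_end : train_end + horizon]
        splits.append((train, valid))
        train_end += horizon   # absorb the validation window into training
    return splits

hours = pd.date_range("2023-01-01", periods=1000, freq="H")
folds = expanding_window_splits(hours, initial=900, horizon=12, n_folds=8)
```

Because each fold's validation window strictly follows its training window, temporal order is preserved and no future information leaks into training.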

Experimented models#

Our initial choice for modeling was Elastic Net CV, a linear regression model that effectively handles multicollinearity and supports feature selection. Given our emphasis on interpretability, a linear model was a suitable option as it offers a straightforward interpretation of coefficients and relationships between variables. In addition to Elastic Net CV, we also considered XGBoost and LightGBM, powerful gradient-boosting algorithms that excel in efficiency and have a strong track record across domains. LightGBM satisfied all our requirements, including accuracy, scalability, handling of multicollinearity, and efficient model fitting. It particularly excelled when run on a GPU, enabling faster training times for large datasets. Notably, LightGBM supports warm initialization, enabling rapid model refitting on new data, which was essential for our real-time forecasting scenario.

Cross Validation Results#

| Model       | RMSE (CAD) | RMSE SD (CAD) |
|-------------|------------|---------------|
| LightGBM    | 82.962592  | 79.157233     |
| ARIMA       | 88.812002  | 67.579395     |
| Elastic Net | 89.302110  | 66.732290     |
| XGBoost     | 94.000088  | 70.299748     |

As seen from the above results, all the models demonstrated good performance according to the RMSE metric, but LightGBM excelled given our specific needs. Its computational efficiency, rapid fit times, and warm-start capability, which retains and updates previously learned state, made it ideal for quick model updates whenever AESO published new data. Additionally, LightGBM captured spikes in prices, whereas the other models struggled to capture such sudden changes and produced flatter predictions despite their comparable RMSEs, which also explains LightGBM’s larger standard deviation of errors. LightGBM also supported quantile regression, enabling us to generate confidence intervals for our predictions. All in all, its superior performance and adaptability made LightGBM the best of the experimented models for our forecasting pipeline.

Interpretability of predictions#

To achieve our second scientific objective of obtaining feature importance and interpreting our model's predictions, we relied on the SHAP (SHapley Additive exPlanations) framework. SHAP enables us to explain and interpret the individual predictions made by our model. For each prediction the model makes, we can obtain the SHAP value of every feature, quantifying its impact on and contribution to that prediction. To quantify the uncertainty of our predictions, we utilized quantile regression to obtain 95% confidence intervals.
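The additivity at the heart of SHAP can be illustrated with a linear model, for which SHAP values have a closed form (each coefficient times the feature's deviation from its mean). This is a conceptual sketch with synthetic data, not our actual LightGBM + SHAP code:

```python
import numpy as np

rng = np.random.default_rng(7)
X = rng.normal(size=(200, 3))
coef = np.array([5.0, -2.0, 0.5])
y_pred = X @ coef + 100.0          # predictions of a toy linear "model"

# The base value is the average prediction over the dataset.
base_value = y_pred.mean()

# Exact SHAP values for a linear model: coef_j * (x_j - mean(x_j)).
x = X[0]
shap_values = coef * (x - X.mean(axis=0))

# Additivity: base value plus all contributions reconstructs the prediction.
reconstructed = base_value + shap_values.sum()
```

Positive entries in `shap_values` push the prediction above the base value, negative entries pull it below, which is exactly the decomposition shown in our dashboard.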

For more information, please check the Appendix section.

Test Results#

Turning to the much-awaited test results: our modeling pipeline outperformed AESO’s model across all timesteps for the May 25 - May 31st time range, as shown in the table. Our prediction pipeline is superior not only in short-term 1-step and 2-step forecasts but maintains lower RMSE values right up to 12-step forecasts. Notably, the RMSE of our model consistently stays in a relatively low range even at higher-step predictions, meaning its accuracy does not degrade significantly as the forecast horizon extends, a highly desirable property in multi-step forecasting models.

We can see below the stepwise errors for the time range of May 25 - May 31st. Our model’s predictions are better than AESO’s by ~34%, even while predicting twice the number of steps into the future.

```python
import pandas as pd

# Load AESO's published forecast errors and reshape them into a single-row
# table of stepwise RMSE values.
aeso_error = pd.read_csv("../../results/aeso_error.csv")
aeso_error = aeso_error.rename(columns={"Unnamed: 1": "Step RMSE"})
aeso_error = aeso_error[["RMSE (CAD)", "Step RMSE"]].T
aeso_error.columns = aeso_error.iloc[1]  # promote step labels to column names
aeso_error = aeso_error[0:1]             # keep only the RMSE row
aeso_error
```
|            | 1 Step RMSE | 2 Step RMSE | 3 Step RMSE | 4 Step RMSE | 5 Step RMSE | 6 Step RMSE |
|------------|-------------|-------------|-------------|-------------|-------------|-------------|
| RMSE (CAD) | 117.53      | 120.57      | 158.48      | 176.31      | 220.64      | 253.08      |
Here, we calculated the RMSE by making a 12-step forecast for each hour and computing the RMSE between the predicted and actual values. This process is repeated for every hour in the test set, and the resulting RMSEs are averaged. Using this method, the average RMSE from Feb 01 2023 to Feb 02 2023 is 88.89 CAD.
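This evaluation can be sketched as follows:

```python
import numpy as np

# For each cut-off hour we issue a 12-step forecast; stacking these gives
# arrays of shape (n_cutoffs, n_steps). Averaging the squared errors over
# cut-offs yields one RMSE per forecast step.
def stepwise_rmse(actuals, forecasts):
    errors = np.asarray(actuals, dtype=float) - np.asarray(forecasts, dtype=float)
    return np.sqrt(np.mean(errors ** 2, axis=0))
```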

The predictions from Feb 01 2023 to Feb 02 2023 and the stepwise RMSE for our test set are shown below. The training data spans Dec 01 2022 to Jan 31 2023, and the test data spans Feb 01 2023 to Feb 02 2023.
| Step | RMSE (CAD) |
|------|------------|
| 1    | 29.01      |
| 2    | 34.94      |
| 3    | 37.21      |
| 4    | 37.10      |
| 5    | 47.04      |
| 6    | 84.25      |
| 7    | 109.36     |
| 8    | 140.60     |
| 9    | 163.57     |
| 10   | 161.72     |
| 11   | 165.17     |
| 12   | 166.32     |

Data Product#

Our data product offers a scalable, real-time prediction pipeline that accurately forecasts energy prices for the next 12 hours. These predictions come with a high level of explainability, allowing our partners to gain valuable insights and make well-informed decisions regarding their energy purchases. To facilitate this, we have developed a reactive Tableau dashboard that empowers users to easily access and visualize real-time results.

The dashboard is thoughtfully structured into three primary sections to enhance user experience. At the top, the first section features a comprehensive 24-hour energy price timeline chart. This chart provides a holistic view of energy prices by displaying actual prices for the previous 12 hours and forecasted prices for the upcoming 12 hours. Additionally, each hourly prediction is accompanied by a 95% confidence interval, offering a clear representation of the margin of error associated with each forecast. By leveraging this chart, stakeholders can readily assess future price forecasts and gain valuable insights into their projected costs.

Moving to the lower left, the second section presents a dynamic bar chart that updates as users hover over specific predictions in the first chart. The chart highlights the top four features and offers a deeper understanding of the key factors driving each prediction.

Finally, the third section showcases a time series plot illustrating four significant global factors that exhibit a correlation with energy prices.

For more information on the architecture, please check the Appendix section.

tableau_dashboard

Conclusion and Recommendations#

In Alberta’s liberalized market, predicting power prices is an intricate task, as prices rely on a balance between supply and demand intertwined with multiple influencing factors. The high volatility, mixed with seasonality and a lack of other clear patterns, further compounds the challenge of accurate forecasting, and these challenges become pronounced as the prediction window extends to 12 hours. Data availability also poses a hurdle, particularly for real-time, hourly power price predictions: despite identifying significant features, acquiring them in an hourly, real-time format remains difficult. Another crucial aspect of our project has been the constant tug-of-war between model accuracy and interpretability. While simpler models like ARIMA enhance interpretability, they lack the complexity to effectively address the problem. Our current LightGBM model strikes a middle ground, but the ideal balance remains an exploratory challenge, with numerous potential models yet to be examined.

In dealing with the data challenges encountered during the project, we developed an Azure function capable of scraping real-time power generation data from hourly online reports after we received approval from AESO. This solution addresses the issue of data availability, and the function can be added to our pipeline in the future. We also encountered challenges when using the sktime package. Moving forward, it could be advantageous to develop certain components internally, which would allow us to optimize our pipelines and have greater control over them.

Despite the numerous challenges encountered, we succeeded in achieving our scientific objectives. We developed a product capable of delivering 12-hour forecasts with a high level of interpretability and a defined margin of error, empowering energy buyers to make well-informed decisions with greater confidence, potentially helping them in business planning and optimization. The journey was arduous, but as the saying goes, “Nothing worth having comes easy.”

Appendix#

Interpretability of Predictions#

The local interpretability of predictions obtained using SHAP values helps us understand the factors driving individual predictions. To establish a reference point for comparison, we used a base value, which represents the average value of predictions made by the model. This base value acts as a benchmark, allowing us to evaluate the impact of each feature in relation to the model’s expected output. SHAP values can be positive or negative, indicating the direction and magnitude of influence: positive SHAP values indicate that a feature pushed the prediction above the base score, while negative SHAP values indicate a negative influence. In our dashboard, we explicitly showcase the percentage increase or decrease contributed by each feature relative to the base score. This provides a clear understanding of how much each feature contributes to the current prediction, allowing for intuitive interpretation and explanation of the model’s behavior.

For the confidence intervals, since our final model, LightGBM, supports quantile regression, we trained two separate LightGBM models with a quantile-regression objective. The first model was trained to predict the upper limit of the interval, with the desired quantile set to 0.975, representing the 97.5th percentile of the distribution; the second was trained to predict the lower limit, with the quantile set to 0.025, representing the 2.5th percentile. After training, we used the predict() method to obtain the predicted quantiles, which represent the upper and lower limits of the 95% confidence interval. This provided a measure of uncertainty and allowed us to communicate the level of confidence associated with our forecasts, facilitating better decision-making and risk assessment.
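The two-quantile approach can be sketched as follows, with scikit-learn's GradientBoostingRegressor standing in for LightGBM's quantile objective and purely synthetic data (the parameters are illustrative):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic data: a linear trend with substantial noise.
rng = np.random.default_rng(3)
X = rng.uniform(0, 10, size=(500, 1))
y = 50 + 10 * X.ravel() + rng.normal(scale=20, size=500)

# One model per interval bound, each trained on a quantile (pinball) loss.
upper = GradientBoostingRegressor(loss="quantile", alpha=0.975).fit(X, y)
lower = GradientBoostingRegressor(loss="quantile", alpha=0.025).fit(X, y)

# The two predictions bound a ~95% interval for a new observation.
X_new = np.array([[5.0]])
lo, hi = lower.predict(X_new)[0], upper.predict(X_new)[0]
```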

Deployment Architecture#

To make our machine learning pipeline scalable, we deployed our product on Databricks. Our architecture involves two main jobs running within Databricks, along with two storage units: one for storing the predictions consumed by the dashboard and another for archiving all predictions made by the model. The first job serves as our initial training pipeline, responsible for training the model on the training dataset and saving the trained model. We also store the model predictions, upper and lower limits, and the prediction explanations; these details are stored in both the real-time predictions table and the archive table.

To ensure the continuous availability of updated insights, our Tableau dashboard regularly retrieves new data from the real-time predictions table and refreshes its charts accordingly.

In addition to the initial training pipeline, we have implemented an update job that runs on an hourly basis. This job retrieves the actual power price for the past hour, which AESO typically publishes via its API every hour. However, due to current limitations in data availability, we simulate this process using historical data. The update job uses these new values to refit all 12 models, thereby generating the next set of predictions. The updated predictions are then seamlessly integrated with Tableau, allowing us to promptly update our visualizations and plots.

To ensure the timely and accurate generation of real-time hourly predictions, the update job operates on a timer trigger, guaranteeing that our predictions and other artifacts remain up to date and aligned with the latest data.

Databricks is highly scalable and capable of handling data-intensive processes, making it an ideal computing engine. Tableau, in turn, stands out for its user-friendly maintenance and extensive built-in features, making it an efficient tool for visualizing predictions; it seamlessly integrates with various data sources, enabling easy access and analysis within a unified interface. However, there are certain challenges to consider. Currently, our Databricks cluster’s computation speed is slower than our personal laptops’ due to the lack of GPU support in the current configuration. Additionally, certain features available in Python’s Plotly are either absent or more challenging to implement in Tableau, given its non-code-based approach. The Dash library combined with Plotly could have provided more advanced and customizable visualizations, which unfortunately we were unable to explore due to time constraints.

architecture